Speaker identification and localization using shuffled MFCC features and deep learning

Abstract

The use of machine learning in automatic speaker identification and localization systems has recently seen significant advances. However, this progress comes at the cost of using complex models and computations, and of increasing the number of microphone arrays and the amount of training data. Therefore, in this work, we propose a new end-to-end identification and localization model based on a simple fully connected deep neural network (FC-DNN) and just two input microphones. This model can jointly or separately localize and identify an active speaker with high accuracy in single and multi-speaker scenarios by exploiting a new data augmentation approach. In this regard, we propose a novel Mel Frequency Cepstral Coefficients (MFCC) based feature called Shuffled MFCC (SHMFCC) and its variant Difference Shuffled MFCC (DSHMFCC). In order to test our approach, we analyzed the performance of the proposed features under different noise and reverberation conditions for single and multi-speaker scenarios. The results show that our approach achieves high accuracy in these scenarios, outperforms the baseline and conventional methods, and shows robustness even with small-sized training data.

Similar resources

Text Independent Speaker Modeling and Identification Based On MFCC Features

This paper gives an overview of automatic speaker recognition technology, with an emphasis on text-independent recognition. Speaker recognition has been studied actively for several decades. We give an overview of both the classical and the state-of-the-art methods. We start with the fundamentals of automatic speaker recognition, concerning feature extraction and speaker modeling. Here, we describe a ...

Text Dependent Speaker Recognition using MFCC features and BPANN

Mel-Frequency Cepstral Coefficients (MFCCs) are spectral features that are widely used for speaker recognition, and text-dependent speaker recognition systems are the most accurate among voice-based authentication systems. In this paper, a text-dependent speaker recognition method is developed. MFCCs are computed for a selected sentence. The first 13 MFCCs are considered for each frame of duration 26 ms an...

SVM based Emotional Speaker Recognition using MFCC-SDC Features

Interest in enhancing the performance of emotional speaker recognition has grown in recent years. This paper highlights a methodology for speaker recognition under different emotional states based on the multiclass Support Vector Machine (SVM) classifier. We compare two feature extraction methods that are used to represent emotional speech utterances in order to obtain...

MFCC Based Text-Dependent Speaker Identification Using BPNN

Speech processing has emerged as one of the important application areas of digital signal processing. Research fields in speech processing include speech recognition, speaker recognition, speech synthesis, and speech coding. Speaker recognition is one of the most useful and popular biometric recognition techniques in the world, especially in areas in which security is a major conc...

Automatic Speaker Recognition using LPCC and MFCC

A person's voice contains various parameters that convey information such as emotion, gender, attitude, health, and identity. This report discusses speaker recognition, which deals with identifying a person based on the unique voiceprint present in their speech data. Pre-processing of the speech signal is performed before voice feature extraction. This process ensures the voice...

Journal

Journal title: International Journal of Speech Technology

Year: 2023

ISSN: 1381-2416, 1572-8110

DOI: https://doi.org/10.1007/s10772-023-10023-2